Two-Level, Four Factor Factorial Design

Suppose we're given the results of a $2^4$ factorial design, which yields the following contrasts:

$$ A \leftarrow 0.4 \\ B \leftarrow -7.6 \\ C \leftarrow 14.1 \\ D \leftarrow 66.7 \\ AB \leftarrow 16.7 \\ AC \leftarrow 3.1\\ AD \leftarrow 5.2 \\ BC \leftarrow 8.3 \\ BD \leftarrow -3.6 \\ CD \leftarrow 14.3 \\ ABC \leftarrow -0.1 \\ ABD \leftarrow -4.7 \\ ACD \leftarrow 7.7 \\ BCD \leftarrow -2.3 \\ ABCD \leftarrow 3.9 \\ $$

(This follows question 4.3 in Box and Draper.)

We'll use quantile plots to interpret the results - we can't go much deeper than that, since the problem does not give the vaues of input or response variables.

We'll start by importing some libraries, then populate a dictionary with the various main and interation effects (the only assumption is that the response has a mean of 0, x0 = 0.0).


In [1]:
%matplotlib inline

import pandas as pd
import numpy as np
from numpy.random import rand, seed
import seaborn as sns
import scipy.stats as stats
from matplotlib.pyplot import *

In [2]:
effects = {}
effects[0] = {'x0': 0.0}
effects[1] = {'x1': 0.4,
              'x2': -7.6,
              'x3': 14.1,
              'x4': 66.7}

effects[2] = {'x1-x2': 16.7,
              'x1-x3': 3.1,
              'x1-x4': 5.2,
              'x2-x3': 8.3,
              'x3-x4': 14.3}
effects[3] = {'x1-x2-x3': -0.1,
              'x1-x2-x4': -4.7,
              'x1-x3-x4': 7.7,
              'x2-x3-x4': -2.3}

effects[4] = {'x1-x2-x3-x4': 3.9}

Now we can use that dictionary to create a dataframe (this is a bit more work than it needs to be, but this is showing you how might structure the data for more complicated interactions). The important thing is that we have a labeled list that's a Pandas DataFrame:


In [3]:
master_dict = {}
for nvars in effects.keys():

    effect = effects[nvars]
    for k in effect.keys():
        v = effect[k]
        master_dict[k] = v

master_df = pd.DataFrame(master_dict,index=['dy']).T
master_df


Out[3]:
dy
x0 0.0
x1 0.4
x1-x2 16.7
x1-x2-x3 -0.1
x1-x2-x3-x4 3.9
x1-x2-x4 -4.7
x1-x3 3.1
x1-x3-x4 7.7
x1-x4 5.2
x2 -7.6
x2-x3 8.3
x2-x3-x4 -2.3
x3 14.1
x3-x4 14.3
x4 66.7

In [4]:
#print help(master_df.sort)
view = master_df.sort_values(by='dy',ascending=False)
view


Out[4]:
dy
x4 66.7
x1-x2 16.7
x3-x4 14.3
x3 14.1
x2-x3 8.3
x1-x3-x4 7.7
x1-x4 5.2
x1-x2-x3-x4 3.9
x1-x3 3.1
x1 0.4
x0 0.0
x1-x2-x3 -0.1
x2-x3-x4 -2.3
x1-x2-x4 -4.7
x2 -7.6

This table shows that one variable, $x_4$, has a huge effect that outshadows all other effects. In case it wasn't obvious already, the quantile-quantile plot makes it crystal clear:


In [5]:
# Quantile-quantile plot of effects:

fig = figure(figsize=(4,4))
ax1 = fig.add_subplot(111)

stats.probplot(master_df['dy'], dist="norm", plot=ax1)
ax1.set_title('Effect Size: Quantile-Quantile Plot')
show()


The largest ordered value in the effects, which is the effect of $x_4$, is way off by itself. In this region of operational state-space, $x_4$ dominates the system's behavior.


In [ ]: